Frequency map selection using a RBFN-based classifier in the MVDR beamformer for speaker localization in reverberant rooms
Abstract
We present the weighted minimum variance distortionless response (WMVDR), a steered response power (SRP) algorithm for near-field speaker localization in a reverberant environment. The proposed WMVDR uses a machine learning approach to compute the incoherent frequency fusion of narrowband power maps. We adopt a radial basis function network (RBFN) classifier to estimate the weighting coefficients, and use a marginal distribution of the narrowband power map as the feature for supervised training. Simulations demonstrate the effectiveness of the proposed approach in different conditions.
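The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the array geometry, the phase-only steering vectors, the diagonal loading, the per-map peak normalization, and the uniform weights are all assumptions made for illustration. In the paper the weights would come from the RBFN classifier, which is not reproduced here.

```python
import numpy as np

def steering(freq, mics, grid, c=343.0):
    """Phase-only near-field steering vectors (an assumed model).

    mics: (M, 2) microphone positions, grid: (K, 2) candidate positions.
    Returns a (K, M) complex array, one steering vector per row.
    """
    dist = np.linalg.norm(grid[:, None, :] - mics[None, :, :], axis=2)
    return np.exp(-2j * np.pi * freq * dist / c)

def mvdr_power(R, A, loading=1e-3):
    """Narrowband MVDR power 1 / (a^H R^-1 a) for every row a of A."""
    M = R.shape[0]
    # Diagonal loading (an assumption) keeps the solve well conditioned.
    Rl = R + loading * np.real(np.trace(R)) / M * np.eye(M)
    Rinv_a = np.linalg.solve(Rl, A.T)                      # (M, K)
    denom = np.real(np.einsum("km,mk->k", A.conj(), Rinv_a))
    return 1.0 / denom

def fuse_maps(maps, weights):
    """Weighted incoherent fusion: sum_f w_f * P_f / max(P_f)."""
    maps = np.asarray(maps, dtype=float)
    weights = np.asarray(weights, dtype=float)
    normalized = maps / maps.max(axis=1, keepdims=True)
    return weights @ normalized

# Hypothetical setup: 8-microphone line array, candidates on a line 1.5 m away.
mics = np.stack([np.linspace(-0.35, 0.35, 8), np.zeros(8)], axis=1)
grid = np.stack([np.linspace(-1.0, 1.0, 41), np.full(41, 1.5)], axis=1)
freqs = [500.0, 1000.0, 1500.0]
true_idx = 28  # index of the simulated source position in the grid

maps = []
for f in freqs:
    A = steering(f, mics, grid)
    a_s = A[true_idx]
    # Idealized covariance: one source plus spatially white noise.
    R = np.outer(a_s, a_s.conj()) + 0.1 * np.eye(len(mics))
    maps.append(mvdr_power(R, A))

# Placeholder uniform weights; a trained classifier would supply these instead.
weights = np.ones(len(freqs)) / len(freqs)
fused = fuse_maps(maps, weights)
est_idx = int(np.argmax(fused))
print("estimated grid index:", est_idx)
```

With uniform weights this reduces to a plain average of normalized narrowband maps. The point of a learned weighting is to give less influence to frequency bins whose maps are corrupted by reverberation, which this idealized, reverberation-free simulation does not model.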
Similar resources
A New Post-filter Algorithm Combined with Two-step Adaptive Beamformer
The optimal microphone array, in the sense of minimum mean square errors (MMSE), includes two processing blocks: the minimum variance distortionless response (MVDR) beamformer and the single-channel Wiener filter, which acts as post-filter. In this paper, we propose a new post-filter algorithm based on assumptions that both the noise power attenuation factor (NPAF) and signal power attenuation ...
Linear Prediction-based Dereverberation with Advanced Speech Enhancement and Recognition Technologies for the Reverb Challenge
This paper describes systems for the enhancement and recognition of distant speech recorded in reverberant rooms. Our speech enhancement (SE) system handles reverberation with blind deconvolution using linear filtering estimated by exploiting the temporal correlation of observed reverberant speech signals. Additional noise reduction is then performed using an MVDR beamformer and advanced model-...
Adaptive beamforming and soft missing data decoding for robust speech recognition in reverberant environments
This paper presents a novel approach to combine microphone array processing and robust speech recognition for reverberant multi-speaker environments. Spatial cues are extracted from a microphone array and automatically clustered to estimate localization masks in the time-frequency domain. The localization masks are then used to blindly design adaptive filters in order to enhance the source sign...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...